Recurrent Multimodal Interaction for Referring Image Segmentation
In this paper we are interested in the problem of image segmentation given
natural language descriptions, i.e. referring expressions. Existing works
tackle this problem by first modeling images and sentences independently and
then segmenting images by combining these two types of representations. We argue
that learning word-to-image interaction is more native in the sense of jointly
modeling two modalities for the image segmentation task, and we propose
convolutional multimodal LSTM to encode the sequential interactions between
individual words, visual information, and spatial information. We show that our
proposed model outperforms the baseline model on benchmark datasets. In
addition, we analyze the intermediate output of the proposed multimodal LSTM
approach and empirically explain how this approach enforces a more effective
word-to-image interaction. Comment: To appear in ICCV 2017. See http://www.cs.jhu.edu/~cxliu/ for code
and supplementary material.
Problems in the design and operation of uncertain complex engineering systems
In this dissertation, we consider two problems. The first is a general approach to the optimal design of uncertain dynamical systems, where the uncertainty is represented by a random parameter. The problem is formulated using two types of performance criteria, which result in two different optimal design methods. However, both are difficult to solve analytically for most uncertain complex dynamical systems. A numerical scheme is developed for the optimal design that involves two steps. First, in order to obtain a numerical algorithm for the optimal solution, we apply randomized algorithms for average performance synthesis to approximate the optimal solution. Second, using the properties of the Perron-Frobenius operator, we develop an efficient computational approach for calculating the stationary distribution of the uncertain dynamical systems and the average performance criteria. The proposed approach is demonstrated through numerical examples. The second problem is a novel approach for evaluating the short-term Loss of Load Probability (LOLP) in power systems that include wind generation resources that vary stochastically in time. We first introduce a mathematical model for calculating the short-term LOLP, and then derive a novel quantitative measure of its behavior as it converges to its steady-state level. In addition, we offer corresponding empirical formulas that can be used in practice to estimate the convergence time of LOLP under different conditions. Finally, we apply the analytical results to estimate the dynamic behavior of short-term LOLP with an actual wind generation profile, showing the significance of the developed measures.
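As a rough illustration of the stationary-distribution step mentioned in the abstract above, the sketch below assumes the dynamics have already been approximated by a finite-state Markov chain with a row-stochastic transition matrix. That discretization, the function name `stationary_distribution`, and the use of plain power iteration are assumptions made for illustration, not the dissertation's algorithm.

```python
import numpy as np


def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi @ P for a row-stochastic matrix P.

    Hypothetical helper: plain power iteration from the uniform distribution.
    """
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = pi @ P  # push the distribution one step forward
        if np.abs(nxt - pi).sum() < tol:  # L1 change below tolerance
            return nxt
        pi = nxt
    return pi  # best estimate if the tolerance was not reached


# Two-state example chain (values chosen only for illustration).
P_example = np.array([[0.9, 0.1],
                      [0.5, 0.5]])
pi_example = stationary_distribution(P_example)
```

Power iteration converges here because the example chain is irreducible and aperiodic; for large or slowly mixing chains a sparse eigenvalue solver may be a better fit.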
High spatial-resolution imaging of label-free in vivo protein aggregates by VISTA
Amyloid aggregation, formed by aberrant proteins, is a pathological hallmark of neurodegenerative diseases, including Alzheimer's disease and Huntington's disease. High-resolution holistic mapping of the fine structures of these aggregates should facilitate our understanding of their pathological roles. Here, we achieved label-free high-resolution imaging of the polyQ and the amyloid-beta (Aβ) aggregates in cells and tissues utilizing a sample-expansion stimulated Raman strategy. We further focused on characterizing the Aβ plaques in 5XFAD mouse brain tissues. 3D volumetric imaging enabled visualization of the whole plaques, resolving both the fine protein filaments and the surrounding components. Coupling our expanded label-free Raman imaging with machine learning, we obtained specific segmentation of aggregate cores and peripheral filaments, together with cell nuclei and blood vessels, by pre-trained convolutional neural network models. Combining this with 2-channel fluorescence imaging, we achieved a 6-color holistic view of the same sample. This ability for precise and multiplex high-resolution imaging of the protein aggregates and their micro-environment without the requirement of labeling would open new biomedical applications.
Neural Collapse Inspired Federated Learning with Non-iid Data
One of the challenges in federated learning is the non-independent and
identically distributed (non-iid) characteristics between heterogeneous
devices, which cause significant differences in local updates and affect the
performance of the central server. Although many studies have been proposed to
address this challenge, they only focus on local training and aggregation
processes to smooth the changes and fail to achieve high performance with deep
learning models. Inspired by the phenomenon of neural collapse, we force each
client to be optimized toward an optimal global structure for classification.
Specifically, we initialize it as a random simplex Equiangular Tight Frame
(ETF) and fix it as the unit optimization target of all clients during the
local updating. After guaranteeing all clients are learning to converge to the
global optimum, we propose to add a global memory vector for each category to
remedy the parameter fluctuation caused by the bias of the intra-class
condition distribution among clients. Our experimental results show that our
method can improve the performance with faster convergence speed on
different-size datasets. Comment: 11 pages, 5 figures
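The simplex Equiangular Tight Frame (ETF) named in the abstract above has a standard closed form, M = sqrt(K/(K-1)) · U (I_K − (1/K) 1 1ᵀ), where U has orthonormal columns. The sketch below builds one with NumPy; the function name `simplex_etf`, the QR-based random orthonormal basis, and the requirement `feat_dim >= num_classes` are assumptions for illustration and are not taken from the paper's code.

```python
import numpy as np


def simplex_etf(num_classes, feat_dim, seed=0):
    """Return a (feat_dim, num_classes) matrix whose columns form a simplex ETF.

    Hypothetical helper: columns have unit norm and every pair of distinct
    columns has inner product -1 / (num_classes - 1).
    """
    if num_classes < 2 or feat_dim < num_classes:
        raise ValueError("need num_classes >= 2 and feat_dim >= num_classes")
    rng = np.random.default_rng(seed)
    # Random matrix with orthonormal columns, obtained from a reduced QR.
    u, _ = np.linalg.qr(rng.standard_normal((feat_dim, num_classes)))
    k = num_classes
    centering = np.eye(k) - np.ones((k, k)) / k  # I - (1/K) 1 1^T
    return np.sqrt(k / (k - 1)) * (u @ centering)


etf = simplex_etf(num_classes=4, feat_dim=16)
gram = etf.T @ etf  # pairwise inner products between class vectors
```

In the setting the abstract describes, a matrix like this would be generated once and kept fixed as the shared classifier target for every client.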
Parameter-Efficient Multilingual Summarisation: An Empirical Study
With the increasing prevalence of Large Language Models, traditional full
fine-tuning approaches face growing challenges, especially in memory-intensive
tasks. This paper investigates the potential of Parameter-Efficient
Fine-Tuning, focusing on Low-Rank Adaptation (LoRA), for complex and
under-explored multilingual summarisation tasks. We conduct an extensive study
across different data availability scenarios, including full-data, low-data,
and cross-lingual transfer, leveraging models of different sizes. Our findings
reveal that LoRA lags behind full fine-tuning when trained with full data;
however, it excels in low-data scenarios and cross-lingual transfer.
Interestingly, as models scale up, the performance gap between LoRA and full
fine-tuning diminishes. Additionally, we investigate effective strategies for
few-shot cross-lingual transfer, finding that continued LoRA tuning achieves
the best performance compared to both full fine-tuning and dynamic composition
of language-specific LoRA modules.
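For readers unfamiliar with Low-Rank Adaptation, the following is a minimal NumPy sketch of the general idea: the pretrained weight stays frozen and only a low-rank update B·A is trained. The class name `LoRALinear`, the `alpha / rank` scaling, and the initialization scheme are illustrative assumptions; this is not the study's implementation.

```python
import numpy as np


class LoRALinear:
    """Hypothetical linear layer with a frozen weight and a low-rank update."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight  # frozen pretrained weight, never updated
        self.A = rng.standard_normal((rank, in_dim)) * 0.01  # trainable
        self.B = np.zeros((out_dim, rank))  # trainable, zero-initialized
        self.scale = alpha / rank

    def __call__(self, x):
        # Frozen path plus the scaled low-rank path: x W^T + s * (x A^T) B^T
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def num_trainable(self):
        return self.A.size + self.B.size


base_weight = np.ones((8, 6))
layer = LoRALinear(base_weight, rank=2)
x = np.ones((3, 6))
y = layer(x)
```

Because B starts at zero, the adapted layer initially reproduces the frozen layer exactly, and the number of trainable parameters is rank × (in_dim + out_dim) rather than in_dim × out_dim.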